The success of any locking policy depends on the security of the locks. Locks must be impervious to careless or malicious interference from other processes or users. If they are not secure from interference, it is a trivial matter to defeat the locks and expose programs to unpredictable behaviour. Deleting all the lock inodes suffices to subvert the locking mechanism.
Locks may also be defeated by setting the parameters ExpireAfter and IfElapsed to zero. In the former case, proper exclusion of contentious processes will be disabled, and in the latter protection from spamming will be disabled.
It is assumed that users of the locks will not sabotage themselves by setting these parameters to silly values. It would be an interesting investigation to see whether optimal values of these parameters could be found for a specific type of atom, and whether the values could even be determined automatically.
In order to deal with atoms which frequently overrun their allotted time, we may note a rule of thumb, namely that ExpireAfter should generally be greater than IfElapsed. If ExpireAfter < IfElapsed, expiry will occur every time a new atom is started after an overrun. This is probably too soon, since the aim is to give the atoms a chance to complete their work.
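The effect of the two parameters can be made concrete with a minimal sketch. The Python below is not the actual implementation: the names `LockState` and `decide`, the order of the two checks and the use of minutes as the time unit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockState:
    """Hypothetical record kept for one atom (times in minutes)."""
    last_started: Optional[float]  # when the atom last acquired its lock, or None
    running: bool                  # True if an earlier instance still holds the lock


def decide(now: float, state: LockState, if_elapsed: float, expire_after: float) -> str:
    """Return 'run', 'skip' or 'expire' for a new instance of the atom."""
    if state.last_started is None:
        return "run"                      # the atom has never been run
    age = now - state.last_started
    if age < if_elapsed:
        return "skip"                     # anti-spamming: too soon since the last run
    if state.running:
        if age >= expire_after:
            return "expire"               # overrun: kill the old instance and take over
        return "skip"                     # old instance is still within its allotted time
    return "run"
```

In this sketch, `decide(1, LockState(0, True), 0, 0)` returns `"expire"`: with both parameters set to zero, a running atom is killed by the very next instance and nothing is ever skipped. With `expire_after < if_elapsed`, any new instance that passes the anti-spamming check also satisfies the expiry check, which is the behaviour the rule of thumb warns against.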
The function of the active locks is to enforce a correct or sensible interleaving of the atomic operations. The optimal definition of atoms can play a key role in determining the correctness of behaviour. The so-called locking granularity is central to this issue. A central assumption in our treatment here is that the atoms themselves do not lead to subversive or incorrect behaviour. No locking policy can effectively restrict what happens within the atoms.
To illustrate the importance of granularity, consider an extreme example in which two concurrent threads contain a circular wait loop. Suppose two threads each run at regular intervals (see figure).
Thread #1 performs two operations in sequence: the first is to sleep until object X is created, the second is to create object Y. Thread #2 sleeps until Y is created and then creates object X. Clearly neither thread can proceed in this circular wait loop and deadlock ensues. Let us now consider the two alternative ways of locking these actions and the resulting behaviour.
If we lock both actions as a single atom, then expiry will cause the threads to die and be restarted after a certain time. However, each time the threads are started, they fall into the same trap, since they can never proceed past the first operation.
If, on the other hand, we lock each operation as a separate atom, the deadlock can be broken, provided the locks are correctly removed from the killed process and the anti-spamming locks are updated. The scenario is then as follows. The first time the threads run, they fall into deadlock. After a certain time, however, the threads expire and one or more threads are killed when a new process tries to run the atom. If we assume that IfElapsed is greater than Δt, the scheduling interval of the program, then, as the new threads start, insufficient time will have elapsed since the last lock was written for each thread, and the first operation will not be executed. This allows the offending atom to be hopped over, and the deadlock is circumvented.
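The difference between the two granularities can be illustrated with a toy model. The sketch below abstracts time away entirely: the `skip` set stands in for "the anti-spamming lock of this atom was written too recently", and the names `run_thread` and `scenario` are hypothetical. It assumes, as in the text, that stuck atoms are killed on expiry and that their anti-spamming locks are updated.

```python
def run_thread(atoms, objects, skip):
    """Run atoms in order; return the name of the atom that blocks, or None.

    atoms:   list of (atom_name, steps), each step being (needs, creates)
    objects: set of objects created so far (modified in place)
    skip:    atom names hopped over because their lock is too recent
    """
    for name, steps in atoms:
        if name in skip:
            continue                      # anti-spamming: hop over this atom
        for needs, creates in steps:
            if needs is not None and needs not in objects:
                return name               # sleeps forever waiting for `needs`
            if creates is not None:
                objects.add(creates)
    return None


def scenario(thread1, thread2):
    objects = set()
    # Run 1: no locks exist yet, so nothing is skipped.
    stuck1 = run_thread(thread1, objects, set())
    stuck2 = run_thread(thread2, objects, set())
    # Expiry: the stuck atoms are killed and their anti-spamming locks updated.
    # Run 2: the atom each thread was stuck in is too recent and is hopped over.
    run_thread(thread1, objects, {stuck1})
    run_thread(thread2, objects, {stuck2})
    # Run 3: enough time has elapsed, so nothing is skipped.
    return run_thread(thread1, objects, set()), run_thread(thread2, objects, set())


# Both operations locked as a single atom per thread.
coarse = scenario([("t1", [("X", None), (None, "Y")])],
                  [("t2", [("Y", None), (None, "X")])])

# Each operation locked as a separate atom.
fine = scenario([("wait_X", [("X", None)]), ("make_Y", [(None, "Y")])],
                [("wait_Y", [("Y", None)]), ("make_X", [(None, "X")])])
```

Here `coarse` is `("t1", "t2")`: skipping the whole atom creates nothing, so both threads block again on the next run. `fine` is `(None, None)`: hopping over only the waiting atom lets each thread create the object the other needs.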
The assumptions in this scenario are clear:
No greater assurances against deadlock can be given, nor do we attempt to cover every avenue of circular dependency. The possible cases are quite complicated. If silly values are chosen for the parameters IfElapsed and ExpireAfter, we can theoretically end up in a deadlock situation. For our purpose of utilizing locks in autonomous system administration, the likelihood of such strange loops is small and of mainly theoretical interest. We therefore decline to analyze the problem further in this context, but end with the following claim. If Δt is the scheduling interval (the interval at which you expect to re-run atoms), then
All of these theoretical diversions should not detract from the real intention of the parameters, namely to provide reasonable protection from unforeseen conditions. For normal script execution, on an hourly basis, we recommend values approximately as follows:
Δt | 1 hour |
IfElapsed | 15 mins |
ExpireAfter | 1 hour 30 mins |
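As a worked example of what these values imply, the sketch below computes when a hung atom would first be expired. It rests on assumptions not stated in the text: a new instance is attempted exactly every Δt after the hung atom started, and the hung atom is killed at the first attempt at which both IfElapsed and ExpireAfter have been exceeded. The function name `first_expiry` is hypothetical.

```python
import math
from datetime import timedelta


def first_expiry(dt: timedelta, if_elapsed: timedelta, expire_after: timedelta) -> timedelta:
    """Time after the start of a hung atom at which a new instance first expires it."""
    # The new instance must pass the anti-spamming check and find the lock expired.
    threshold = max(if_elapsed, expire_after)
    # Attempts happen only at whole multiples of the scheduling interval.
    runs = max(1, math.ceil(threshold / dt))
    return runs * dt


recommended = first_expiry(
    dt=timedelta(hours=1),
    if_elapsed=timedelta(minutes=15),
    expire_after=timedelta(hours=1, minutes=30),
)
```

Under these assumptions `recommended` is two hours: the attempt after one hour finds the atom still within its allotted 90 minutes and leaves it alone, and the attempt after two hours expires it.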